Using the Meeting Graph Framework to Minimise Kernel Loop Unrolling for Scheduled Loops

نویسندگان

Mounira Bachir

David Gregg

Sid Ahmed Ali Touati

چکیده

This paper improves our previous research effort [1] by providing an efficient method for kernel loop unrolling minimisation in the case of already scheduled loops, where circular lifetime intervals are known. When loops are software pipelined, the number of values simultaneously alive becomes exactly known giving better opportunities for kernel loop unrolling. Furthermore, fixing circular lifetime intervals allows us to reduce the algorithmic complexity of our method compared to [1] by computing a new research space for minimal kernel loop unrolling. The meeting graph (MG) [3] is one of the frameworks proposed in the literature which models loop unrolling and register allocation together in a common formal framework for software pipelined loops. Although MG significantly improves loop register allocation, the computed loop unrolling may lead to unpractical code growth. This work proposes to minimise the loop unrolling degree in the meeting graph by making an adaptation of the approach described in [1]. We explain how to reduce the research space for minimal kernel loop unrolling in the context of MG, yielding to a reduced algorithmic complexity. Furthermore, our experiments on SPEC2000, SPEC2006, MEDIABENCH and FFMPEG show that in concrete cases the loop unrolling minimisation is very fast and the minimal loop unrolling degree for 75% of the optimised loops is equal to 1 (i.e. no unroll), while it is equal to 7 when the software pipelining (SWP) schedule is not fixed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Decomposing Meeting Graph Circuits to Minimise Kernel Loop Unrolling

This article studies an important open problem in backend compilation regarding loop unrolling after periodic register allocation. Although software pipelining is a powerful technique to extract fine-grain parallelism, variables can stay alive across more than one kernel iteration, which is challenging for code generation. The classical software solution that does not alter the computation thro...

متن کامل

Extending Loop Unrolling and Shifting for Reconfigurable Architectures

Loops are an important source of optimization. In this paper, we propose an extension to our work on loop unrolling and loop shifting for reconfigurable architectures. By applying unrolling and shifting to a small loop containing a hardware kernel and some software code, we relocate the function calls contained in the loop body such that in every iteration of the transformed loop, software func...

متن کامل

The meeting graph : a new model for loop cyclic register allocation Christine

Register allocation is a compiler phase in which the gains can be essential in achieving performance on new architectures exploiting instruction level parallelism. We focus our attention on loops and improve the existing methods by introducing a new kind of graph. We model loop unrolling and register allocation together in a common framework, called the meeting graph. We expect that our results...

متن کامل

Generic Loop Parallelization for Reconfigurable Architectures

Reconfigurable Computing (RC) is one of the most intensively studied research areas nowadays due to its potential to dramatically increase application performance. RC combines a general purpose processor (GPP) and a Field Programmable Gate Array (FPGA), having the advantages of both hardware performance and software flexibility. Modern real-life applications (such as audio, video, image process...

متن کامل

The meeting graph: a new model for loop cyclic register allocation

Register allocation is a compiler phase in which the gains can be essential in achieving performance on new architec-tures exploiting instruction level parallelism. We focus our attention on loops and improve the existing methods by introducing a new kind of graph. We model loop unrolling and register allocation together in a common framework, called the meeting graph. We expect our results to ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2009

Using the Meeting Graph Framework to Minimise Kernel Loop Unrolling for Scheduled Loops

نویسندگان

چکیده

منابع مشابه

Decomposing Meeting Graph Circuits to Minimise Kernel Loop Unrolling

Extending Loop Unrolling and Shifting for Reconfigurable Architectures

The meeting graph : a new model for loop cyclic register allocation Christine

Generic Loop Parallelization for Reconfigurable Architectures

The meeting graph: a new model for loop cyclic register allocation

عنوان ژورنال:

اشتراک گذاری